Large-Scale Taxonomy Mapping for Restructuring and Integrating Wikipedia

نویسندگان

  • Simone Paolo Ponzetto
  • Roberto Navigli
چکیده

We present a knowledge-rich methodology for disambiguating Wikipedia categories with WordNet synsets and using this semantic information to restructure a taxonomy automatically generated from the Wikipedia system of categories. We evaluate against a manual gold standard and show that both category disambiguation and taxonomy restructuring perform with high accuracy. Besides, we assess these methods on automatically generated datasets and show that we are able to effectively enrich WordNet with a large number of instances from Wikipedia. Our approach produces an integrated resource, thus bringing together the fine-grained classification of instances in Wikipedia and a wellstructured top-level taxonomy from WordNet.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Distinguishing between Instances and Classes in the Wikipedia Taxonomy

This paper presents an automatic method for differentiating between instances and classes in a large scale taxonomy induced from the Wikipedia category network. The method exploits characteristics of the category names and the structure of the network. The approach we present is the first attempt to make this distinction automatically in a large scale resource. In contrast, this distinction has...

متن کامل

Using Goi-Taikei as an Upper Ontology to Build a Large-Scale Japanese Ontology from Wikipedia

We present a novel method for building a large-scale Japanese ontology from Wikipedia using one of the largest Japanese thesauri, Nihongo Goi-Taikei (referred to hereafter as “Goi-Taikei”) as an upper ontology. First, The leaf categories in the Goi-Taikei hierarchy are semi-automatically aligned with semantically equivalent Wikipedia categories. Then, their subcategories are created automatical...

متن کامل

Deriving a Large-Scale Taxonomy from Wikipedia

We take the category system inWikipedia as a conceptual network. We label the semantic relations between categories using methods based on connectivity in the network and lexicosyntactic matching. As a result we are able to derive a large scale taxonomy containing a large amount of subsumption, i.e. isa, relations. We evaluate the quality of the created resource by comparing it with ResearchCyc...

متن کامل

WikiTaxonomy: A Large Scale Knowledge Resource

We present a taxonomy automatically generated from the system of categories in Wikipedia. Categories in the resource are identified as either classes or instances and included in a large subsumption, i.e. isa, hierarchy. The taxonomy is made available in RDFS format to the research community, e.g. for direct use within AI applications or to bootstrap the process of manual ontology creation.

متن کامل

The importance of cross-lingual information for matching Wikipedia with the Cyc ontology

In this paper we try to answer the question how cross-lingual evidence may improve matching between di erent classi cation schemas. We concentrate speci cally on the task of mapping between Wikipedia categories and Cyc terms as well as the classi cation of Wikipedia articles to the Cyc taxonomy and show how this process may be improved by consuming the evidence that is available in di erent edi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009